We propose a novel policy gradient method for multi-agent reinforcement learning, which leverages two different variance-reduction techniques and does not require large batches over iterations. Specifically, we propose a momentum-based decentralized policy gradient tracking (MDPGT) method, in which a new momentum-based variance-reduction technique is used to approximate the local policy gradient surrogate with importance sampling, and an intermediate parameter is adopted to track two consecutive policy gradient surrogates. MDPGT provably achieves the best available sample complexity of O(N⁻¹ ε⁻³) for converging to an ε-stationary point of the global average of N local performance functions (possibly nonconcave). This outperforms the state-of-the-art sample complexity in decentralized model-free reinforcement learning, and when initialized with a single trajectory, the sample complexity matches that obtained by existing decentralized policy gradient methods. We further validate the theoretical claim for the Gaussian policy function. When the required error tolerance ε is small enough, MDPGT leads to a linear speedup, which has been previously established in decentralized stochastic optimization, but not for reinforcement learning. Lastly, we provide empirical results on a multi-agent reinforcement learning benchmark environment to support our theoretical findings.
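As a rough illustration of the update described above, the following NumPy sketch combines a momentum-based, importance-sampling-corrected policy-gradient surrogate with a gradient-tracking step. The gradient estimator, importance weight, mixing matrix W, and step sizes eta/beta are illustrative placeholders under assumed values, not the paper's implementation.

```python
# A minimal sketch of an MDPGT-style update in NumPy. The gradient estimator,
# importance weight, mixing matrix W, and step sizes eta/beta below are
# illustrative placeholders, not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

N, d = 4, 8                     # number of agents, policy-parameter dimension
W = np.full((N, N), 1.0 / N)    # doubly stochastic mixing matrix (fully connected)
eta, beta = 0.05, 0.3           # step size and momentum coefficient

def grad_estimate(theta):
    """Placeholder stochastic policy-gradient surrogate from one sampled trajectory."""
    return -theta + 0.1 * rng.standard_normal(theta.shape)

def importance_weight(theta_old, theta_new):
    """Placeholder importance-sampling ratio between the old and new policies."""
    return float(np.exp(-0.5 * np.sum((theta_new - theta_old) ** 2)))

theta = rng.standard_normal((N, d))                 # local policy parameters
u = np.array([grad_estimate(th) for th in theta])   # momentum-based surrogates
v = u.copy()                                        # gradient-tracking variables

for t in range(200):
    theta_next = W @ theta + eta * v                # mix with neighbors, ascend tracked direction
    u_next = np.empty_like(u)
    for i in range(N):
        g_new = grad_estimate(theta_next[i])
        g_old = grad_estimate(theta[i])
        w_is = importance_weight(theta[i], theta_next[i])
        # Momentum-based variance reduction with an importance-sampling correction.
        u_next[i] = beta * g_new + (1.0 - beta) * (u[i] + g_new - w_is * g_old)
    v = W @ v + (u_next - u)                        # track two consecutive surrogates
    theta, u = theta_next, u_next

print("consensus error:", np.linalg.norm(theta - theta.mean(axis=0)))
```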
-
In distributed machine learning, where agents collaboratively learn from diverse private data sets, there is a fundamental tension between consensus and optimality. In this paper, we build on recent algorithmic progress in distributed deep learning to explore various consensus-optimality trade-offs over a fixed communication topology. First, we propose the incremental consensus-based distributed stochastic gradient descent (i-CDSGD) algorithm, which performs multiple consensus steps (where each agent communicates information with its neighbors) within each SGD iteration. Second, we propose the generalized consensus-based distributed SGD (g-CDSGD) algorithm, which enables us to navigate the full spectrum from complete consensus (all agents agree) to complete disagreement (each agent converges to its own model parameters). We analytically establish convergence of the proposed algorithms for strongly convex and nonconvex objective functions; we also analyze the momentum variants of the algorithms for the strongly convex case. We support our algorithms via numerical experiments, and demonstrate significant improvements over existing methods for collaborative deep learning.
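The two update rules can be sketched as follows; the quadratic local objectives, ring mixing matrix, step size eta, consensus-step count tau, and interpolation weight omega are illustrative assumptions rather than the paper's experimental setup.

```python
# Hypothetical sketch of the i-CDSGD and g-CDSGD updates described above, in NumPy.
# Objectives, topology, and hyperparameters are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(1)

N, d = 5, 3                      # number of agents, model dimension
W = np.zeros((N, N))             # doubly stochastic mixing matrix for a ring topology
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = 0.25
    W[i, (i + 1) % N] = 0.25

targets = rng.standard_normal((N, d))   # each agent's private optimum
eta, tau, omega = 0.1, 3, 0.7           # step size, consensus steps, interpolation weight

def sgd_grad(theta):
    """Stochastic gradients of the local quadratics 0.5 * ||x - a_i||^2."""
    return (theta - targets) + 0.05 * rng.standard_normal(theta.shape)

# i-CDSGD: tau consensus (mixing) steps within each SGD iteration.
theta = rng.standard_normal((N, d))
for _ in range(200):
    mixed = theta
    for _ in range(tau):
        mixed = W @ mixed               # repeated neighbor averaging
    theta = mixed - eta * sgd_grad(theta)

# g-CDSGD: omega interpolates between complete consensus (omega = 1)
# and fully independent local SGD (omega = 0).
phi = rng.standard_normal((N, d))
for _ in range(200):
    phi = omega * (W @ phi) + (1.0 - omega) * phi - eta * sgd_grad(phi)

print("i-CDSGD disagreement:", np.linalg.norm(theta - theta.mean(axis=0)))
print("g-CDSGD disagreement:", np.linalg.norm(phi - phi.mean(axis=0)))
```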